Exploiting the query structure for efficient join ordering in SPARQL queries

Authors

  • Andrey Gubichev
  • Thomas Neumann
Abstract

The join ordering problem is a fundamental challenge that has to be solved by any query optimizer. Since high-performance RDF systems are often implemented as triple stores (i.e., they represent RDF data as a single table with three attributes, at least conceptually), the query optimization strategies employed by such systems are often adopted from relational query optimization. In this paper we show that the techniques borrowed from traditional SQL query optimization (such as the Dynamic Programming algorithm or greedy heuristics) are not immediately capable of handling large SPARQL queries. We introduce a new join ordering algorithm that performs a SPARQL-tailored query simplification. Furthermore, we present a novel RDF statistical synopsis that accurately estimates cardinalities in large SPARQL queries. Our experiments show that this algorithm is highly superior to state-of-the-art SPARQL optimization approaches, including RDF-3X's original Dynamic Programming strategy.

Similar resources

Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information

With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they demand too much memory and processing to remain practical, so random and evolutionary methods have to be used instead. The use of evolutionary methods, beca...

A Join Operator for Property Graphs

In the graph database literature, the term “join” does not refer to an operator combining two graphs, but involves path traversal queries over a single graph. Current languages express binary joins through the combination of path traversal queries with graph creation operations. Such a solution proves to be inefficient. In this paper we introduce a binary graph join operator and a corresponding ...

Efficient Execution of Top-K SPARQL Queries

Top-k queries, i.e. queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solut...

MapSQ: A MapReduce-based Framework for SPARQL Queries on GPU

In this paper, we present a MapReduce-based framework (named MapSQ) for evaluating SPARQL queries on GPU over large-scale RDF datasets efficiently and with high performance. Firstly, we develop a MapReduce-based join algorithm to handle SPARQL queries in a parallel way. Secondly, we present a coprocessing strategy to manage the process of evaluating queries, where the CPU is used to assign sub...

Publication date: 2014